Component-Dependency based Micro-Rejuvenation Scheduling
Authors
Abstract
With the growth of the Internet, “always on” services are becoming increasingly important. Software rejuvenation is a well-known proactive technique that prevents failures due to software aging and extends the lifetime of long-running software such as Internet servers, billing systems, and telecommunication switches [1]. However, rebooting a machine to rejuvenate it takes on the order of minutes and can be expensive for large-scale Internet systems, even when clusters are employed [2]. To reduce the downtime caused by machine reboot, a fine-grained reboot technique called micro-reboot, which reboots individual components, was proposed in [2]. Micro-reboot has been shown to be an order of magnitude faster than machine reboot, and systems can be rejuvenated in parts without ever doing a full reboot. Micro-rejuvenation [2] is a proactive technique that uses micro-reboot to prevent aging failures. Since micro-rejuvenation involves rebooting components, rejuvenating a component also requires rejuvenating the transitive closure of its dependents [2].
A complete rejuvenation schedule specifies the rejuvenation time instants so that the cost of rejuvenation is minimized. Complete rejuvenation can be triggered by a periodic or random timer, by transaction load, or by failure prediction [1,3,4]. Micro-rejuvenation can be triggered in the same three ways. However, a micro-rejuvenation schedule must specify both the rejuvenation time instants and the component(s) to be rejuvenated. Micro-rejuvenation schedules are less well studied in the literature than complete rejuvenation schedules. One approach to micro-rejuvenation scheduling, proposed in [2], rejuvenates when the utilization of a system resource, such as memory, is above a certain threshold. Components are rejuvenated until the system resources return to a normal level, and they are selected in order of the amount of resources they released in earlier rejuvenations.
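The threshold-based selection attributed to [2] above can be illustrated with a minimal sketch. The function name, the data layout, and the assumptions that components are tried in descending order of previously released resources and that each one frees about as much as it did last time are all hypothetical; they are not taken from [2].

```python
def select_for_rejuvenation(usage, threshold, released_history):
    """Pick components to micro-reboot until estimated usage is back at the threshold.

    usage            : current utilization of one resource (hypothetical unit, e.g. MB)
    threshold        : level regarded as normal for that resource
    released_history : component name -> amount released by its previous rejuvenation
    """
    selected = []
    # Assumption: try the components that freed the most resources last time first.
    ranked = sorted(released_history.items(), key=lambda item: item[1], reverse=True)
    for name, released in ranked:
        if usage <= threshold:
            break  # resources are back at a normal level
        selected.append(name)
        usage -= released  # assumption: the component frees about as much as before
    return selected, usage


if __name__ == "__main__":
    history = {"session-cache": 50, "report-engine": 150, "search-index": 100}
    print(select_for_rejuvenation(900, 700, history))
```

With the hypothetical figures in the usage example, the two components with the largest past releases are selected and the third is left running, because the estimated usage has already dropped below the threshold.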
Another approach is to adapt a complete rejuvenation schedule into a micro-rejuvenation schedule. For example, if the complete rejuvenation schedule is load-triggered, that is, the system is rejuvenated when its load exceeds a threshold, the adapted micro-rejuvenation schedule is component-load-triggered, that is, a component is rejuvenated when its load exceeds a threshold. We refer to these two approaches as simple scheduling, because both make the simplifying assumption that all components are independent and have the same rejuvenation cost. Both simple scheduling approaches are easy to implement; however, they do not take the dependencies among components into account and can therefore yield non-optimal schedules. When a component is rejuvenated, all its dependent components are rejuvenated as well, so any optimal schedule must take into account the fact that dependent components will be rejuvenated twice or more. In this paper, we propose a dependency-aware micro-rejuvenation scheduling policy that uses a load-based trigger and rejuvenates independent components only. We develop a SAN model that closely reflects the real system and use it to evaluate our scheduling policy.
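As an illustration only, the sketch below shows one possible reading of "rejuvenates independent components only": among the components whose load exceeds the threshold, trigger only those that are not already covered by the transitive closure of dependents of another overloaded component. The abstract does not give the policy's details, so the function names, the graph representation, and the assumption of an acyclic dependency graph are hypothetical.

```python
from collections import deque


def dependents_closure(dependents, component):
    """Return every component that transitively depends on `component`.

    dependents: component name -> list of components that depend on it directly
    """
    seen = set()
    queue = deque([component])
    while queue:
        current = queue.popleft()
        for dep in dependents.get(current, ()):
            if dep not in seen and dep != component:
                seen.add(dep)
                queue.append(dep)
    return seen


def components_to_trigger(dependents, load, threshold):
    """Pick overloaded components that no other overloaded component will restart.

    Assumption: the dependency graph is acyclic; with cycles, mutually
    dependent overloaded components would all be treated as covered.
    """
    overloaded = {name for name, value in load.items() if value > threshold}
    covered = set()
    for name in overloaded:
        covered |= dependents_closure(dependents, name)
    return sorted(overloaded - covered)


if __name__ == "__main__":
    # Hypothetical system: "orders" and "billing" depend on "db", "web" on "orders".
    dependents = {"db": ["orders", "billing"], "orders": ["web"]}
    load = {"db": 0.90, "orders": 0.95, "billing": 0.20, "web": 0.10}
    print(components_to_trigger(dependents, load, threshold=0.8))
```

In the hypothetical example, both db and orders exceed the threshold, but only db is triggered: rebooting db already restarts orders through the closure, whereas a component-load-triggered simple schedule would restart orders twice.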
Similar resources
Optimal Rejuvenation Scheduling of Distributed Computation Based on Dynamic Programming
Recently, a complementary approach to handle transient software failures, called software rejuvenation, is becoming popular as a proactive fault management technique in operational software systems. In this study, we develop the optimal scheduling algorithms to trigger software rejuvenation in distributed computation circumstance. In particular, we focus on two different computation circumstanc...
Reliability-Based Software Rejuvenation Scheduling for Cloud-Based Systems
The reliability and availability of a cloud-based system play an important role in evaluating its system performance. Due to the promised high reliability of physical facilities provided for cloud services, software faults have become a major factor for failures of cloud-based systems. In this paper, we focus on the software aging phenomenon where system performance may be progressively degrade...
Using Micro-Reboots to Improve Software Rejuvenation in Apache Tomcat
As software complexity increases so does the difficulty in solving all software defects before the production stage, even with advanced software testing tools. Those software defects are often the cause for application crashes. To tolerate application crashes the industry has adopted several clustering techniques: server-redundancy, load-balancers and server-failover. The latest trend goes towa...
Software Rejuvenation in Embedded Systems
Mobile communication devices have multitasking embedded software running in their operating systems (OS) as well as applications. Both the OS modules and the application components are assigned predetermined memory in those devices due to their near-realtime performance requirements. Memory (stack and heap) overflow problems occur in such software components because of programmer’s inability to...
A comparative experimental study of software rejuvenation overhead
In this paper we present a comparative experimental study of the main software rejuvenation techniques developed so far to mitigate the software aging effects. We consider six different rejuvenation techniques with different levels of granularity: (i) physical node reboot, (ii) virtual machine reboot, (iii) OS reboot, (iv) fast OS reboot, (v) standalone application restart, and (vi) application...
Publication date: 2008